agents,lib,src,test: add traceSampleRate support#430
agents,lib,src,test: add traceSampleRate support#430santigimeno wants to merge 1 commit intonode-v24.x-nsolid-v6.xfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds an optional double ChangesTrace sampling end-to-end
Sequence DiagramsequenceDiagram
participant App as JavaScript App
participant Config as nsolid.config
participant gRPC as gRPC Agent
participant EnvList as EnvList (C++)
participant Binding as binding.trace_sample_rate
participant Tracer as OpenTelemetry Tracer
participant Span as Root Span
App->>Config: updateConfig({ traceSampleRate: X })
Config->>Config: parseTraceSampleRate(X)
Config->>gRPC: send reconfigure (traceSampleRate: X)
gRPC->>EnvList: receive reconfigure
EnvList->>EnvList: validate_trace_sample_rate -> sanitize
EnvList->>EnvList: update_tracing_sample_rate(X)
EnvList->>Binding: write trace_sample_rate (shared buffer)
App->>Tracer: start root span
Tracer->>Binding: read trace_sample_rate
Tracer->>Span: decide sampled = Math.random() < trace_sample_rate
alt sampled
Span->>Span: set traceFlags = SAMPLED
else not sampled
Span->>Span: set traceFlags = NONE
end
Span-->>Tracer: return span
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
test/agents/test-grpc-reconfigure.mjs (1)
171-208: Consider addingNaN/Infinityinvalid-rate cases in this gRPC path test.This block currently checks only out-of-range finite numbers. Extending it with
Number.NaNand±Infinitywould better guard the gRPC reconfigure edge cases already covered in non-gRPC tests.✅ Minimal test extension
- const invalidRates = [2, -0.5]; + const invalidRates = [2, -0.5, Number.NaN, Number.POSITIVE_INFINITY, Number.NEGATIVE_INFINITY];🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/agents/test-grpc-reconfigure.mjs` around lines 171 - 208, Update the test "should preserve previous traceSampleRate for invalid values" to include NaN and Infinity cases: add Number.NaN, Number.POSITIVE_INFINITY and Number.NEGATIVE_INFINITY (or +/-Infinity) to the invalidRates array used with grpcServer.reconfigure and client.config assertions so grpcServer.reconfigure(agentId, { traceSampleRate: invalidRate }) is exercised for NaN/Infinity and the existing assert.strictEqual checks that nsolidConfig.traceSampleRate remained 0.4 still apply.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@agents/grpc/src/grpc_agent.cc`:
- Around line 1761-1763: PopulateReconfigureEvent currently omits the
traceSampleRate field when building the outgoing reconfigure event body; update
PopulateReconfigureEvent to set body.traceSampleRate from the internal config so
the outgoing event mirrors inbound updates (i.e., when you previously read
body.tracesamplerate() on incoming updates, ensure PopulateReconfigureEvent
calls the corresponding setter to populate body.traceSampleRate in the event).
Locate the PopulateReconfigureEvent function in grpc_agent.cc and add the
traceSampleRate assignment using the same source/field used for other mapped
settings so reconfigure responses include traceSampleRate.
In `@lib/nsolid.js`:
- Around line 1144-1154: The parseTraceSampleRate function currently coerces
unintended types via unary +; update parseTraceSampleRate to only accept inputs
that are typeof 'number' or 'string', explicitly reject booleans and other
types, trim string inputs and return undefined for empty/whitespace-only
strings, then convert the trimmed numeric string (or the number input) to a
numeric value and validate with NumberIsFinite and range checks (>=0 && <=1);
ensure the function returns undefined for invalid types, whitespace-only
strings, NaN, non-finite numbers, or values outside the 0–1 range while
returning the numeric rate for valid inputs.
In `@test/parallel/test-nsolid-config-trace-sample-rate-env.js`:
- Around line 15-18: The test currently spreads process.env into the child env
(env: {...process.env, ...envVars}), which can leak ambient NSOLID_* variables
and make the test nondeterministic; update the env construction in
test/parallel/test-nsolid-config-trace-sample-rate-env.js to explicitly filter
out any keys starting with "NSOLID_" (e.g., build a filteredEnv =
Object.fromEntries(Object.entries(process.env).filter(([k]) =>
!k.startsWith('NSOLID_'))) and then use env: { ...filteredEnv, ...envVars } so
the child process is isolated from ambient NSOLID_* variables while still
allowing envVars to override.
---
Nitpick comments:
In `@test/agents/test-grpc-reconfigure.mjs`:
- Around line 171-208: Update the test "should preserve previous traceSampleRate
for invalid values" to include NaN and Infinity cases: add Number.NaN,
Number.POSITIVE_INFINITY and Number.NEGATIVE_INFINITY (or +/-Infinity) to the
invalidRates array used with grpcServer.reconfigure and client.config assertions
so grpcServer.reconfigure(agentId, { traceSampleRate: invalidRate }) is
exercised for NaN/Infinity and the existing assert.strictEqual checks that
nsolidConfig.traceSampleRate remained 0.4 still apply.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
agents/grpc/proto/reconfigure.protoagents/grpc/src/grpc_agent.ccagents/grpc/src/proto/reconfigure.pb.ccagents/grpc/src/proto/reconfigure.pb.hlib/internal/otel/trace.jslib/nsolid.jssrc/nsolid/nsolid_api.ccsrc/nsolid/nsolid_api.htest/addons/nsolid-tracing/test-otel-basic2.jstest/agents/test-grpc-reconfigure.mjstest/fixtures/nsolid-trace-sample-rate-package.jsontest/fixtures/test-nsolid-config-trace-sample-rate-env-script.jstest/parallel/test-nsolid-config-trace-sample-rate-env.jstest/parallel/test-nsolid-config-trace-sample-rate.jstest/parallel/test-nsolid-trace-sample-rate-sampling.js
5048dab to
85d36e4
Compare
85d36e4 to
d4333fe
Compare
d4333fe to
57df292
Compare
57df292 to
ea1123a
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/nsolid/nsolid_api.cc (1)
1251-1256: 💤 Low valuePrefer checking
configinstead ofcurrto avoid redundant propagation.Currently,
update_tracing_sample_rate()is called whenevertraceSampleRateexists in the mergedcurrconfig, even if the incomingconfigdidn't include it. This triggersRunCommandto allEnvInstinstances on every config update—even for unrelated fields.Other fields like
promiseTracking(line 1202) checkconfig.find()instead, which limits action to updates that actually include that field.♻️ Suggested fix
- it = curr.find("traceSampleRate"); - DCHECK(it == curr.end() || it->is_number()); - if (it != curr.end() && it->is_number()) { + it = config.find("traceSampleRate"); + if (it != config.end()) { + // Validation already ran; curr has the sanitized value + auto curr_it = curr.find("traceSampleRate"); + DCHECK(curr_it != curr.end() && curr_it->is_number()); update_tracing_sample_rate(*it); }Alternatively, if re-applying the same rate is acceptable, consider adding a short-circuit in
update_tracing_sample_rate()to skip propagation when the value hasn't changed.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nsolid/nsolid_api.cc` around lines 1251 - 1256, The current code checks curr for "traceSampleRate" and calls update_tracing_sample_rate even when the incoming config didn't include it; change the condition to check config.find("traceSampleRate") (like promiseTracking does) and only call update_tracing_sample_rate when the incoming config contains that key, so you don't propagate RunCommand to all EnvInsts for unrelated updates; alternatively, if you prefer keeping the check on curr, add an early-return/short-circuit inside update_tracing_sample_rate to detect and skip re-propagation when the effective sample rate value hasn't changed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/nsolid/nsolid_api.cc`:
- Around line 1251-1256: The current code checks curr for "traceSampleRate" and
calls update_tracing_sample_rate even when the incoming config didn't include
it; change the condition to check config.find("traceSampleRate") (like
promiseTracking does) and only call update_tracing_sample_rate when the incoming
config contains that key, so you don't propagate RunCommand to all EnvInsts for
unrelated updates; alternatively, if you prefer keeping the check on curr, add
an early-return/short-circuit inside update_tracing_sample_rate to detect and
skip re-propagation when the effective sample rate value hasn't changed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7ba3f980-5ecd-4511-aa15-09c8a7a00c29
📒 Files selected for processing (15)
agents/grpc/proto/reconfigure.protoagents/grpc/src/grpc_agent.ccagents/grpc/src/proto/reconfigure.pb.ccagents/grpc/src/proto/reconfigure.pb.hlib/internal/otel/trace.jslib/nsolid.jssrc/nsolid/nsolid_api.ccsrc/nsolid/nsolid_api.htest/addons/nsolid-tracing/test-otel-basic2.jstest/agents/test-grpc-reconfigure.mjstest/fixtures/nsolid-trace-sample-rate-package.jsontest/fixtures/test-nsolid-config-trace-sample-rate-env-script.jstest/parallel/test-nsolid-config-trace-sample-rate-env.jstest/parallel/test-nsolid-config-trace-sample-rate.jstest/parallel/test-nsolid-trace-sample-rate-sampling.js
✅ Files skipped from review due to trivial changes (7)
- test/fixtures/nsolid-trace-sample-rate-package.json
- agents/grpc/proto/reconfigure.proto
- agents/grpc/src/grpc_agent.cc
- test/parallel/test-nsolid-trace-sample-rate-sampling.js
- agents/grpc/src/proto/reconfigure.pb.h
- test/fixtures/test-nsolid-config-trace-sample-rate-env-script.js
- agents/grpc/src/proto/reconfigure.pb.cc
🚧 Files skipped from review as they are similar to previous changes (7)
- lib/internal/otel/trace.js
- test/parallel/test-nsolid-config-trace-sample-rate-env.js
- test/parallel/test-nsolid-config-trace-sample-rate.js
- test/addons/nsolid-tracing/test-otel-basic2.js
- lib/nsolid.js
- test/agents/test-grpc-reconfigure.mjs
- src/nsolid/nsolid_api.h
| } | ||
|
|
||
| it = curr.find("traceSampleRate"); | ||
| DCHECK(it == curr.end() || it->is_number()); |
There was a problem hiding this comment.
Is DCHECK the best assertion here? I guess we want to log it?
Add end-to-end traceSampleRate handling across config, runtime propagation, tracing decisions, and regression tests. Why: - Enable configurable probabilistic trace sampling with predictable behavior. - Ensure consistent semantics across all config entry points. - Prevent invalid updates from corrupting current sampling behavior. - Keep transaction consistency by deciding sampling at the root span only. What changed: - Added traceSampleRate parsing and normalization in JS config paths with explicit default fallback and finite/range validation in [0, 1]. - Added native config sanitization for traceSampleRate to reject invalid values before merge, preserving previous valid configuration. - Ensured runtime sampling state is synchronized from effective current config after updates to avoid stale shared-memory sample rates. - Added gRPC reconfigure support for traceSampleRate in proto and agent mapping, including generated protobuf updates. - Updated tracing logic so root spans perform the sampling decision and child spans inherit parent traceFlags. - Extended tests for: - invalid value handling (including NaN/Infinity) - env/package bootstrap behavior - partial updates preserving existing traceSampleRate - gRPC invalid-update fallback behavior - sampling behavior at 0%, 50% (tolerance), and 100% - worker-thread sampling behavior - explicit parent/child trace consistency assertions
ea1123a to
d46dcf9
Compare
Add end-to-end traceSampleRate handling across config, runtime propagation, tracing decisions, and regression tests.
Why:
What changed:
Summary by CodeRabbit
New Features
Bug Fixes
Tests